Automatic Phonetic Transcription of Non-Prompted Speech
Abstract
The "Munich Automatic Segmentation" (MAUS) system labels and segments the phonetic constituents of spoken German in a manner similar to highly trained phoneticians. MAUS has been used to train automatic speech recognition (ASR) systems as well as to provide detailed statistical analyses of spontaneous speech (using the Verbmobil I and RVG I corpora). The MAUS system is a reliable, automatic means of testing linguistic hypotheses concerning the phonetic properties of spontaneous speech and should therefore play an important role in providing the sort of empirical data required to develop more realistic models of spoken language.

1. INTRODUCTION

Over the last five years, our scientific work with recorded non-prompted and even spontaneous German has repeatedly produced results that differ from our textbook knowledge of German phonetics. In light of these observations, it is my opinion that the speech sciences, including phonetics, should follow a new path (alongside the traditional ones, which are of course still to be pursued) to address the problem that scientific models of speech often differ significantly from reality. In part 2 I therefore give some arguments for computational methods based on large, purpose-independent speech corpora. As an example of this type of work, the third section briefly describes the 'Munich Automatic Segmentation' (MAUS) method, and the last part gives three examples in which results from MAUS were used in different experiments or applications. The first example is a statistical evaluation of well-known assimilation processes at word boundaries; the second and third examples describe experiments to improve automatic speech recognition (ASR) by exploiting the pronunciation knowledge obtained from the MAUS segmentation.
Similar resources
Automatic phonetic transcription of large speech corpora: a comparative study
This study investigates whether automatic transcription procedures can approximate manual phonetic transcriptions typically delivered with contemporary large speech corpora. We used ten automatic procedures to generate a broad phonetic transcription of well-prepared speech (read-aloud texts) and spontaneous speech (telephone dialogues). The resulting transcriptions were compared to manually ver...
Automatic Phonetic Transcription of Large Speech Corpora
Most large speech corpora are delivered with a lexicon that contains a canonical transcription of every word in the orthographic transcription. Such a lexicon can be used for generating a hypothetical ‘canonical’ phonetic transcription from the orthography. In addition, time and money permitting, some speech corpora are provided with a manually verified broad phonetic transcription of at least ...
Improving Automatic Phonetic Transcription of Spontaneous Speech Through Variant-Based Pronunciation Variation Modelling
In this paper we present an experiment aimed at improving automatic phonetic transcription of Dutch spontaneous speech through a variant-based method of pronunciation variation modelling. For spontaneous speech, the literature does not always provide enough rules to describe its characteristic phonological processes. Therefore, other methods should be applied to model pronunciation variation fo...
Automatic phonetic transcription of large speech corpora
This study is aimed at investigating whether automatic phonetic transcription procedures can approximate manual transcriptions typically delivered with contemporary large speech corpora. To this end, ten automatic procedures were used to generate a broad phonetic transcription of well-prepared speech (read-aloud texts) and spontaneous speech (telephone dialogues) from the Spoken Dutch Corpus. T...
Data Driven Approaches to Phonetic Transcription with Integration of Automatic Speech Recognition and Grapheme-to-Phoneme for Spoken Buddhist Sutra
We propose a new approach for performing phonetic transcription of text that utilizes automatic speech recognition (ASR) to help traditional grapheme-to-phoneme (G2P) techniques. This approach was applied to transcribe Chinese text into Taiwanese phonetic symbols. By augmenting the text with speech and using automatic speech recognition with a sausage searching net constructed from multiple pro...